Add precipitation histogram to prognostic run report #1271

Merged — 20 commits from feature/process-page merged into master on Jun 23, 2021

Conversation

oliverwm1 (Contributor) commented Jun 18, 2021:

Useful to show a histogram of precipitation on prognostic run reports. Example report HERE.

Significant internal changes:

  • Added a function to compute histograms of arbitrary variables. For now, just doing surface precipitation rate.
  • Created a new process_diagnostics.html page in the prognostic run report. Moved the diurnal cycle there and added a plot of the precipitation histogram.
  • Added a metric to compute percentiles from the above histograms. For now, computing the 25th, 50th, 75th, 90th, 99th, and 99.9th percentiles (see the sketch after this list), e.g.:
"percentile_25/total_precip_to_surface": {
      "value": 0.3747434426709828,
      "units": "mm/day"
}

  • Sped up the report test by only creating a report for a single run (since more than one run is no longer required for report generation).

  • Tests added
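
For reference, a minimal sketch of how a percentile can be derived from precomputed histogram counts and bin edges; the helper name and signature are illustrative assumptions, not the PR's actual API:

import numpy as np

def percentile_from_histogram(q: float, counts: np.ndarray, bin_edges: np.ndarray) -> float:
    # Hypothetical helper: approximates the q-th percentile (0-100) from
    # np.histogram output, where len(bin_edges) == len(counts) + 1.
    # Returns the left edge of the first bin where the cumulative
    # frequency reaches q/100.
    cumulative = np.cumsum(counts) / counts.sum()
    index = np.searchsorted(cumulative, q / 100.0)
    return float(bin_edges[index])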

@@ -229,16 +227,6 @@ def _assign_diagnostic_time_attrs(
    return diagnostics_ds


def dump_nc(ds: xr.Dataset, f):
oliverwm1 (Contributor, Author):

We switched to using the vcm.dump_nc version of this a while ago, so this func is unused.

logger.info("Computing histograms for physics diagnostics")
counts = xr.Dataset()
for varname in prognostic.data_vars:
    count, bins = np.histogram(

oliverwm1 (Contributor, Author):

I found using np.histogram much faster (and simpler code-wise) than doing an xarray groupby.
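
For context, here is a minimal self-contained sketch of that pattern; the function name, flattening, and output layout are illustrative assumptions rather than the PR's exact code:

import numpy as np
import xarray as xr

def compute_histograms(prognostic: xr.Dataset, bins: np.ndarray) -> xr.Dataset:
    # Flatten each variable to 1D and let numpy do the binning; this
    # avoids the overhead of an xarray groupby over bin labels.
    counts = xr.Dataset()
    for varname in prognostic.data_vars:
        count, bin_edges = np.histogram(prognostic[varname].values.ravel(), bins=bins)
        counts[varname] = xr.DataArray(
            count, coords={"bin": bin_edges[:-1]}, dims=["bin"]
        )
    return counts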

oliverwm1 marked this pull request as ready for review June 22, 2021 23:16
-def plot_1d(
-    run_diags: RunDiagnostics, varfilter: str, run_attr_name: str = "run",
-) -> HVPlot:
+def plot_1d(run_diags: RunDiagnostics, varfilter: str) -> HVPlot:

oliverwm1 (Contributor, Author):

Deleted the unused argument run_attr_name.

nbren12 (Contributor) left a comment:

Looks great. Apart from some minor comments below, I had some thoughts:

  1. I was pretty confused about why the metric functions returned xr.Dataset rather than floats, before realizing that the to_dict function looks at the units and values. It would be clearer if the metric functions returned dataclasses with .value and .units attrs, so that we pass around the minimal amount of data (see the sketch after this list).
  2. It would be great to enhance the visibility of the scalar metrics so that they are among the first things we see, e.g. by showing a table instead of the histograms. The histograms are kind of useless and ugly IMO, especially for a single-run report.
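
Something along these lines (a sketch of the suggestion, not code from this PR):

from dataclasses import dataclass

@dataclass
class Metric:
    # Carries exactly what a scalar metric needs: its value and units.
    value: float
    units: str

A metric function would then return e.g. Metric(0.374, "mm/day"), and to_dict could read the .value and .units attributes directly.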

oliverwm1 (Contributor, Author) commented Jun 23, 2021:

> I was pretty confused about why the metric functions returned xr.Dataset rather than floats, before realizing that the to_dict function looks at the units and values. It would be clearer if the metric functions returned dataclasses with .value and .units attrs, so that we pass around the minimal amount of data.

yeah I agree the xr.Dataset is a bit overkill. We could improve this in another PR.

> It would be great to enhance the visibility of the scalar metrics so that they are among the first things we see, e.g. by showing a table instead of the histograms. The histograms are kind of useless and ugly IMO, especially for a single-run report.

ouch, poor histogram ;) It would be helpful to have the verification data on there, I agree. The example is particularly noisy since it's a short run. But yeah, a table showing the percentiles for each run would be quite useful.

oliverwm1 enabled auto-merge (squash) June 23, 2021 22:11
oliverwm1 merged commit 638006a into master Jun 23, 2021
oliverwm1 deleted the feature/process-page branch June 23, 2021 22:11
nbren12 (Contributor) commented Jun 28, 2021:

> ouch, poor histogram ;)

Sorry, I didn't mean your histograms, I meant the bar charts with the metrics on the main page. The histograms are great!

oliverwm1 (Contributor, Author) commented:

> Sorry, I didn't mean your histograms, I meant the bar charts with the metrics on the main page. The histograms are great!

haha sounds good. I agree the bar charts are hard to parse. I'll work on adding a metrics table with the things we actually care about.
